Objective To investigate the inter-rater agreement of the Videofluoroscopic Dysphagia Scale (VDS).
Method The present study was designed as a multicenter, single-blind trial. A Videofluoroscopic Swallowing Study (VFSS) was performed using the protocol described by J. A. Logemann. Thick-fluid, pureed food, mechanically altered food, regularly textured food, and thin-fluid boluses were sequentially swallowed. Each participant received a 3 ml bolus followed by a 5 ml bolus of each food material, in the order mentioned above. All study procedures were video recorded. Discs containing these video recordings in random order were distributed to interpreters who were blinded to the participant information. The video recordings were evaluated using a standardized VDS sheet and the inter-rater reliability was calculated.

Results In total, 100 patients participated in this study and 10 interpreters analyzed the findings. Inter-rater reliability was fair in terms of lip closure (κ=0.325), oral transit time (0.253), delayed triggering of pharyngeal swallowing (0.300), vallecular residue (0.275), laryngeal elevation (0.345), pyriform sinus residue (0.310), coating of the pharyngeal wall (0.310), and aspiration (0.393). However, agreement on the other oral phase parameters was lower than on the pharyngeal phase parameters (0.06-0.153). Moreover, the reliability of the summed VDS score (intraclass correlation coefficient: 0.556) showed moderate agreement.

Conclusion VDS shows a moderate rate of agreement for evaluating the swallowing function. However, many of 
the parameters demonstrated a lower rate of agreement, particularly the oral phase parameters. 
INTRODUCTION

Dysphagia is a frequent result of a stroke, brain tumor, or neurodegenerative disease. Many authors have tried to detect swallowing abnormalities (particularly aspiration) using non-radiographic observations; however, these methods demonstrate poor sensitivity and specificity. The Videofluoroscopic Swallowing Study (VFSS) has been the gold standard for evaluating patients with swallowing disorders for many years. VFSS can detect oral, pharyngeal, and esophageal dysphagia; however, it has a limited ability to predict the prognosis of dysphagia. Among many recent attempts to quantify and predict the prognosis of dysphagia, the Functional Dysphagia Scale (FDS), as reported by Han et al. in 2001, is a useful tool that correlates well with the American Speech-Language-Hearing Association National Outcomes Measurement System (ASHA-NOMS). However, despite its value in explaining the severity of dysphagia, FDS does not predict the long-term prognosis of dysphagia, which is important due to the close relationship between prolonged dysphagia, lower respiratory tract infection, and high mortality.

The Videofluoroscopic Dysphagia Scale (VDS) can be used to predict the long-term prognosis of dysphagia patients following stroke. Han et al. define the long-term prognosis of dysphagia based on the occurrence of any aspiration/penetration event 6 months after the onset of dysphagia. VDS consists of 14 items with weighted values and shows good correlation with aspiration/penetration occurring 6 months after the initial onset of dysphagia. The 14 items in VDS (Appendix 1) represent oral (lip closure, bolus formation, mastication, apraxia, premature bolus loss, and oral transit time) and pharyngeal (pharyngeal triggering, vallecular and pyriform sinus residues, laryngeal elevation and epiglottic closure, pharyngeal coating, pharyngeal transit time, and aspiration) functions that can be observed by VFSS. VDS can also express the severity of dysphagia as a quantifiable score; however, limitations regarding the subjectivity of its results have been noted in previous studies. Stoeckli et al. report high interobserver reliability for some of the parameters used to evaluate aspiration and penetration, but low reliability for other oral and pharyngeal phase parameters. Although their study did not evaluate VDS, it suggests that the results of VFSS can be subjective on several parameters. Since VDS is measured based on the findings of VFSS, its results may also depend on the observer; furthermore, there have not been any studies on the inter-rater reliability of VDS. Therefore, in this study, we investigate the inter-rater reliability of VDS.

MATERIALS AND METHODS 
Participants 

This study was designed as a multicenter (10 rehabilitation centers), single-blind trial. Patients who exhibited any symptoms of difficulty in swallowing were recruited. The criteria for inclusion were patients with (1) a history of aspiration symptoms, such as coughing or choking; (2) symptoms clinically suspicious of dysphagia, such as reduced gag reflex or delayed swallowing reflex; and (3) a history of the use of an alternative feeding method, such as a nasogastric tube. Patients who could not sit or those who had difficulty maintaining consciousness were excluded. All of the recruited patients who agreed to participate in our study underwent VFSS from January through June 2011. The protocol for this study was approved by the Institutional Review Board of Seoul Asan Hospital.

VFSS protocol 

VFSS was conducted by two physiatrists using fluoroscopy. The first physiatrist was a professor with 15 years of experience with VFSS; the second physiatrist was a resident physician. A modified version of the protocol used in Logemann's study was used. The protocol consisted of 2 trials. The first trial was performed with the fluoroscopy projected from the lateral side of the patient. Patients were asked to sit on a chair and then turn 90 degrees away from the fluoroscopy in order to form the lateral projection position. Patients were given 3 and 5 ml boluses of a thick-fluid mixture that contained barium (viscosity above 1,750 centipoise [cP]) using a syringe, followed by 3 and 5 ml boluses of a pureed diet, a mechanically altered diet, and regularly textured food using a spoon. All of the food samples were administered two times. The last step of the first trial consisted of 3 ml and 5 ml boluses of a thin-fluid mixture with barium (viscosity 1-50 cP) administered using a syringe; finally, 2 drinks of a thin-fluid mixture were administered using a cup. The second trial was performed as an anterior-posterior projection with the patient sitting in an upright position. During the second trial, patients were asked to drink a thin-fluid mixture from a cup. If there was a large amount of aspiration, the study was aborted and the patients were encouraged to expectorate the food material. All of the study procedures were recorded as AVI files (30 frames/second). After all of the patients finished the VFSS, the video recordings were collected and each file was given a randomized number. Next, the files were copied to 10 DVDs, with each DVD containing all of the video recordings in a different randomized order. The DVDs were sent to the interpreters for analysis.
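
The disc preparation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' actual procedure: the function name `make_disc_orders`, the fixed seed, and the use of sequential integers as recording IDs are all hypothetical.

```python
import random


def make_disc_orders(n_recordings, n_discs, seed=2011):
    """Return one playback order per disc.

    Each order is a permutation of the recording IDs 1..n_recordings,
    and no two discs share the same order. The seed is arbitrary and
    only makes the illustration reproducible.
    """
    rng = random.Random(seed)
    recording_ids = list(range(1, n_recordings + 1))
    orders = []
    seen = set()
    while len(orders) < n_discs:
        order = recording_ids[:]      # copy so the base list stays sorted
        rng.shuffle(order)            # in-place random permutation
        key = tuple(order)
        if key not in seen:           # keep only orders not used on another disc
            seen.add(key)
            orders.append(order)
    return orders


# 100 recordings distributed on 10 discs, as in the study design
orders = make_disc_orders(n_recordings=100, n_discs=10)
print(len(orders), "discs, first five recordings on disc 1:", orders[0][:5])
```

Giving every interpreter a different order is one way to keep fatigue or learning effects from systematically favoring the same recordings across raters.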

Interpretation 

All of the participating interpreters were physiatrists who had at least 5 years of experience in interpreting VFSS results. They agreed to participate after being informed of the nature of this study. All patient information, including age, sex, and underlying diseases, was withheld from the interpreters. The interpreters only observed the patients using the files on the DVD and described their findings using a standardized format (Appendix 1).

Statistical Methods 

The intra-class correlation coefficient (ICC), model (2,1), of the VDS score was calculated in order to test the inter-rater reliability based on the VDS scores provided by the interpreters. The ICC was used because it can be applied not only to scale variables but also to ordinal variables; for ordinal variables, it is equivalent to the weighted kappa. ICC values over 0.80 were considered "very good" and ICC values between 0.60 and 0.80 were considered "good." The consistency of the other items was evaluated using Cohen's kappa (κ).
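
As an illustration of the ICC calculation, the following minimal sketch computes ICC(2,1) from a subjects-by-raters matrix of scores. It assumes that "model 2.1" refers to the two-way random-effects, absolute-agreement, single-rater form in the Shrout and Fleiss notation; the function name `icc_2_1` and the small ratings matrix are illustrative only and are not study data.

```python
def icc_2_1(scores):
    """ICC(2,1) from a list of rows, one row per subject, one column per rater.

    Uses the two-way ANOVA mean squares:
      BMS = between-subjects, JMS = between-raters, EMS = residual.
    """
    n = len(scores)            # number of subjects
    k = len(scores[0])         # number of raters
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for row in scores for v in row)
    ss_err = ss_total - ss_rows - ss_cols

    bms = ss_rows / (n - 1)
    jms = ss_cols / (k - 1)
    ems = ss_err / ((n - 1) * (k - 1))
    return (bms - ems) / (bms + (k - 1) * ems + k * (jms - ems) / n)


# Illustrative ratings: 6 subjects scored by 4 raters (not study data)
ratings = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
print(round(icc_2_1(ratings), 2))  # 0.29
```

Because the between-rater mean square enters the denominator, systematic differences between interpreters (one rater scoring consistently higher than another) lower this coefficient, which suits a study where absolute agreement on the score matters.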

RESULTS 

One hundred patients (59 males and 41 females) with dysphagia were enrolled, including 64 stroke patients, 13 patients with traumatic brain injury, 12 patients with head and neck cancer, 6 patients with brain tumors, and 5 patients with other diseases. The average age of the enrolled patients was 64.4±14.8 years. All of the recruited patients underwent VFSS. The inter-rater reliability of the oral phase parameters is shown in Table 1. All of the oral phase parameters demonstrated low reliability (κ<0.4). Among the oral phase parameters, lip closure showed the highest reliability (κ=0.325), whereas premature bolus loss and oral apraxia demonstrated the lowest reliabilities (κ=0.060 and κ=0.099, respectively). Table 1 also presents data on pharyngeal phase reliability. Pharyngeal phase parameters demonstrated higher reliability than the oral phase parameters, but all κ values were below 0.4. Aspiration showed the highest reliability of all of the tested parameters (κ=0.393). Total score reliability, in terms of the ICC, was 0.556.

Table 1. Inter-rater Reliability of VDS

Parameter                             κ       SE      95% CI
Lip closure                           0.325   0.024   0.278-0.373
Bolus formation                       0.153   0.023   0.108-0.197
Mastication                           0.123   0.025   0.074-0.171
Apraxia                               0.099   0.018   0.063-0.135
Tongue to palate contact              0.153   0.024   0.107-0.200
Premature bolus loss                  0.060   0.016   0.028-0.092
Oral transit time                     0.253   0.026   0.202-0.303
Triggering of pharyngeal swallowing   0.300   0.026   0.250-0.351
Vallecular residue                    0.275   0.015   0.245-0.304
Laryngeal elevation                   0.202   0.026   0.152-0.253
Pyriform sinus residue                0.345   0.017   0.312-0.379
Coating on the pharyngeal wall        0.310   0.026   0.260-0.361
Pharyngeal transit time               0.165   0.026   0.115-0.216
Aspiration                            0.393   0.019   0.356-0.431

                                      ICC             95% CI
Total score                           0.556           0.463-0.648

SE: Standard error, CI: Confidence interval, ICC: Intra-class correlation coefficient
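
The confidence limits in Table 1 are consistent with a normal-approximation interval, κ ± 1.96 × SE. The sketch below checks this for a few rows and attaches a verbal label to each value. The helper names `kappa_ci` and `agreement_label` are hypothetical; the two upper bands ("good" and "very good") follow the Statistical Methods section, while the cut-offs for "poor," "fair," and "moderate" are an assumption chosen to match the wording used in the abstract.

```python
def kappa_ci(kappa, se, z=1.96):
    """Normal-approximation confidence interval: kappa +/- z * SE."""
    half_width = z * se
    return kappa - half_width, kappa + half_width


def agreement_label(value):
    """Verbal band for a kappa or ICC value.

    The 0.60 and 0.80 thresholds come from the text; the 0.20 and 0.40
    thresholds are assumed for this illustration.
    """
    if value > 0.80:
        return "very good"
    if value >= 0.60:
        return "good"
    if value > 0.40:
        return "moderate"
    if value > 0.20:
        return "fair"
    return "poor"


# (kappa, SE) pairs copied from Table 1
table_rows = {
    "Lip closure": (0.325, 0.024),
    "Premature bolus loss": (0.060, 0.016),
    "Aspiration": (0.393, 0.019),
}

for name, (kappa, se) in table_rows.items():
    low, high = kappa_ci(kappa, se)
    print(f"{name}: {low:.3f} to {high:.3f} ({agreement_label(kappa)})")
```

The computed limits agree with the tabulated ones to within rounding of the published κ and SE values.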

DISCUSSION 

The past two decades have brought an enormous expansion of knowledge in dysphagia research and treatment. The most valuable and frequently used diagnostic tool for the evaluation of dysphagia is VFSS. While the VFSS protocol has been standardized for use in many research projects, it has a limited ability to predict dysphagia prognosis and to provide a quantitative evaluation of dysphagia. Many physicians have tried to predict the long-term prognosis of dysphagia, and as a result, there are several studies on the long-term prognosis of dysphagia after a stroke. Delayed oral transit time, penetration, age over 70 years, poor Barthel index, and the presence of a frontal and insular cortex lesion have been suggested to indicate poor prognosis. However, if risk factors alone cannot explain the quantitative probability of a poor prognosis, the VDS should be used to quantitatively investigate and predict the severity of dysphagia 6 months after the onset of a stroke.

Overall, the VDS score demonstrated low to moderate reliability in our study (0.556 in terms of ICC). However, the 14 individual subparameters, particularly the oral phase parameters, showed low reliability. A previous study conducted by Stoeckli et al. reported low oral phase reliability (κ=0.15-0.56); the highest value was for lip closure (κ=0.56). Lip closure also demonstrated the highest reliability among the oral phase parameters in our study (κ=0.325). Stoeckli et al. reported higher values than those of our study because lip closure was classified as a binary value ("yes" or "no") in their study, without any intermediate values. Lip closure on VDS has 3 categorical values ("intact," "inadequate," and "none"); however, "inadequate" lip closure lacks an accurate definition and can be defined arbitrarily by the interpreter depending on which food material is used as the standard for evaluation. For example, if the lip closure of a patient was very good for a pureed diet but poor for the liquid diet, it might be classified as "inadequate" or "none" depending on which food material the interpreter chose to use as the standard.

Regarding the pharyngeal phase, the overall reliability was higher than that of the oral phase (κ=0.165-0.393 vs. κ=0.060-0.325, respectively), similar to other studies that reported higher reliability for pharyngeal phase parameters than for oral phase parameters. This is because many pharyngeal phase parameters have two categorical values (e.g., the triggering of pharyngeal swallowing, laryngeal elevation, the coating of the pharyngeal wall, and pharyngeal transit time). Also, the pharyngeal phase parameters can be clearly seen on VFSS. Penetration was defined as the passage of material into the larynx, but not through the vocal folds, and aspiration was defined as the passage of material through the vocal folds. These pharyngeal phase findings are relatively easier to differentiate than the oral phase findings.

The total VDS score demonstrated higher reliability than the individual parameters (0.556 in terms of ICC). This is due to the dilution effect of the scores of each parameter given by the interpreters.

The overall reliability is not particularly high in our study, and we believe there are two reasons for this. First, no clear definitions exist for the intermediate values of VDS, even though 9 of the 14 parameters have at least 3 categorical values. For example, "intact" mastication is given 0 points and "inadequate" mastication is given 4 points according to the VDS; however, depending on how each interpreter classifies the patient's mastication function, the same patient can be given either 0 or 4 points. Therefore, the evaluation of patients showing some poor functioning of the parameters may lack consistency from interpreter to interpreter. Second, guidelines specifying the type of food to be used as a standard for evaluation do not exist. In our study, various types of food material were tested on each patient. Depending on which type of material was used as the standard for evaluation, VFSS findings may be classified differently for each patient. For example, patients demonstrating good swallowing of solid foods but poor swallowing of liquid foods may be interpreted differently depending on whether solid or liquid foods were used for evaluation. For future studies, there should be guidelines regarding which food materials should be used as the standard for evaluating the findings related to each parameter.

This study has an obvious limitation. The interpretation was performed only via the observation of VFSS video recordings, as it was not logistically possible to have all 10 interpreters examine each patient. If the interpreters had been allowed to clinically examine their patients, this would have improved the results of the interpretations by increasing accuracy. However, the objective of this study was to evaluate the inter-rater reliability of VDS based on VFSS findings. If the interpreters had predicted the findings from the clinical examination, this would have acted as a bias.

This is the first study to evaluate the inter-rater reliability of VDS. For future studies, a more precise and widely accepted study protocol will be needed. The development of such a protocol can be achieved by standardized education programs, such as interactive lecture movies or formal guidelines for interpreters. These education programs may contribute to achieving higher levels of accuracy in interpretation, and subsequently, to improving the ability to predict the long-term prognosis of dysphagia.

CONCLUSION 

VDS demonstrates a moderate rate of inter-rater reliability for evaluating the swallowing function. Some of the parameters demonstrated a lower rate of agreement, particularly the oral phase parameters. VDS has some limitations in predicting the long-term prognosis of dysphagia; hence, more accurate definitions of each parameter as well as a study protocol will be essential.
Appendix 1. Videofluoroscopic Dysphagia Scale

Parameter                 Findings    Score   Parameter                         Findings     Score
Lip closure               Intact      0       Triggering of pharyngeal swallow  Normal       0
                          Inadequate  2                                         Delayed      4.5
                          None        4
Bolus formation           Intact      0       Vallecular residue                None         0
                          Inadequate  3                                         <10%         2
                          None        6                                         10-50%       4
                                                                                >50%         6
Mastication               Intact      0       Laryngeal elevation               Normal       0
                          Inadequate  4                                         Impaired     9
                          None        8
Apraxia                   None        0       Pyriform sinus residue            None         0
                          Mild        1.5                                       <10%         4.5
                          Moderate    3                                         10-50%       9
                          Severe      4.5                                       >50%         13.5
Tongue to palate contact  Intact      0       Coating on the pharyngeal wall    No           0
                          Inadequate  5                                         Yes          9
                          None        10
Premature bolus loss      None        0       Pharyngeal transit time           <1.0s        0
                          <10%        1.5                                       >1.0s        6
                          10-50%      3
                          >50%        4.5
Oral transit time         <1.5s       0       Aspiration                        None         0
                          >1.5s       3                                         Penetration  6
                                                                                Aspiration   12
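
The weighted scoring in Appendix 1 amounts to a lookup and a sum. The following minimal sketch encodes the weights from the appendix and totals one interpreter's findings for one patient. The dictionary keys, the function name `vds_total`, and the example patient are illustrative assumptions; only the numeric weights come from the appendix.

```python
# Weights transcribed from Appendix 1; key names are illustrative.
VDS_WEIGHTS = {
    # Oral phase
    "lip_closure": {"intact": 0, "inadequate": 2, "none": 4},
    "bolus_formation": {"intact": 0, "inadequate": 3, "none": 6},
    "mastication": {"intact": 0, "inadequate": 4, "none": 8},
    "apraxia": {"none": 0, "mild": 1.5, "moderate": 3, "severe": 4.5},
    "tongue_to_palate_contact": {"intact": 0, "inadequate": 5, "none": 10},
    "premature_bolus_loss": {"none": 0, "<10%": 1.5, "10-50%": 3, ">50%": 4.5},
    "oral_transit_time": {"<1.5s": 0, ">1.5s": 3},
    # Pharyngeal phase
    "pharyngeal_triggering": {"normal": 0, "delayed": 4.5},
    "vallecular_residue": {"none": 0, "<10%": 2, "10-50%": 4, ">50%": 6},
    "laryngeal_elevation": {"normal": 0, "impaired": 9},
    "pyriform_sinus_residue": {"none": 0, "<10%": 4.5, "10-50%": 9, ">50%": 13.5},
    "pharyngeal_wall_coating": {"no": 0, "yes": 9},
    "pharyngeal_transit_time": {"<1.0s": 0, ">1.0s": 6},
    "aspiration": {"none": 0, "penetration": 6, "aspiration": 12},
}


def vds_total(findings):
    """Sum the weights for one rating sheet; all 14 items must be present."""
    missing = set(VDS_WEIGHTS) - set(findings)
    if missing:
        raise ValueError(f"missing items: {sorted(missing)}")
    return sum(VDS_WEIGHTS[item][findings[item]] for item in VDS_WEIGHTS)


# Reference sheets: the lowest- and highest-weighted finding for every item
normal = {item: min(levels, key=levels.get) for item, levels in VDS_WEIGHTS.items()}
worst = {item: max(levels, key=levels.get) for item, levels in VDS_WEIGHTS.items()}

# Hypothetical patient: inadequate lip closure, delayed triggering, penetration
patient = dict(normal, lip_closure="inadequate",
               pharyngeal_triggering="delayed", aspiration="penetration")

print(vds_total(normal), vds_total(patient), vds_total(worst))  # 0 12.5 100.0
```

With these weights the oral phase items contribute up to 40 points and the pharyngeal phase items up to 60, so the total ranges from 0 to 100. A one-category disagreement between interpreters on a heavily weighted item, such as pyriform sinus residue, shifts the total by 4.5 points.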